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Abstract 

The quadratic assignment problem (QAP) is one of the most difficult combinatorial optimization problems. 
One of the most powerful and commonly used heuristics to obtain approximations to the optimal solution 
of the QAP is simulated annealing (SA). We present an efficient implementation of the SA heuristic which 
performs more than 100 times faster then existing implementations for large problem sizes and a large 
number of SA iterations. 
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1. Introduction 

Originally formulated by Koopmans and Beck- 
mann (1957), the QAP is NP-hard and is consid- 
ered to be one of the most difficult problems to be 
solved optimally. It was defined in the following 
context: A set of N facilities are to be located at 
N locations. The quantity of materials which flow 
between facilities i and j is Aij and the distance 
between locations i and j is Bij. The problem is 
to assign to each location a single facility so as to 
minimize (or maximize) the cost 



N N 



p{i)p{j)-> 



=1 i=i 



where p{i) represents the location to which i is as- 
signed. 

The QAP formulation has since been applied to 
such diverse problems as the optimal assignment of 
classes to classrooms (Dickey and Hopkins, 1972), 
design of DNA microarrays (de Carvalho and Rah- 
mann, 2006), cross species gene analysis (Kolar, 
et al., 2008), creation of hospital layouts (Elshafei, 
1977), and assignment of components to locations 
on circuit boards (Steinberg, 1961). 
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In addition to being important in its own right, 
the QAP includes such other combinatorial opti- 
mization problems as the traveling salesman prob- 
lem and graph partitioning as special cases. 

There is an extensive literature that addresses 
the QAP and which is reviewed in Pardalos et al. 
(1994), Cela (1998), Anstreicher (2003), Loiola et 
al. (2007), James et al. (2009) and Burkhard 
et al. (2009). The QAP is exceedingly hard to 
solve optimally. With the exception of specially 
constructed cases, optimal algorithms have solved 
only relatively small instances with A/" < 36. Vari- 
ous heuristic approaches have been developed and 
applied to problems typically of size N ^ 100 or 
less. By contrast, a travelling salesman problem 
consisting of almost 25,000 towns in Sweden has 
been solved exactly (Applegate et al., 2007). 



2. Background 

The simulated annealing heuristic was first ap- 
plied to the QAP by Burkhard and Rendl (1984) 
and was refined by Connolly (1990). The heuristic 
consists of swapping locations of two facilities. Pro- 
posed swaps can either be determined randomly or 
selected according to some sequential enumeration 
of all possible swaps. For each proposed swap, (5, 
the change in cost for the potential swap, is calcu- 
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lated. The swap is made if 5 is negative or if 



e-^/^ > r, 



where T is an analog of temperature in physical sys- 
tems that is slowly decreased according to a spec- 
ified cooling schedule after each iteration and r is 
a uniformly distributed random variable between 
and 1 . Randomly making moves which increase the 
cost is done to help escape from local minima. 

In traditional implementations of the heuristic, 
the cost of making a swap is calculated from scratch 
when the swap is considered in order to determine 
if the swap should be made. The computational 
complexity of this calculation is 0{N); and if the 
SA run is composed of / iterations, the complexity 
is 0{IN). We can write the time needed to execute 
/ iterations as 

Urad = CqNI (1) 

where cq is a constant. 

3. Approach 

The approach we take here is to attempt to re- 
duce the complexity by maintaining a matrix A of 
costs of each possible swap. This approach is moti- 
vated by the work of Taillard (1991), who applied 
a similar approach in the application of another 
heuristic, tabu search, to the QAP. Our A-matrix 
approach is as follows: 

(a) Initialize by creating a matrix A{p^u^v) con- 
taining the cost of swapping u and v for all u 
and V, given a current assignment p. 

(b) In each iteration, retrieve the cost A(p, r, s) of 
the proposed swap (r, s). 

(c) If the proposed swap is not made in a given 
iteration, go to (b). 

(d) If the proposed swap is made in a given itera- 
tion, update A(p, v) to refiect the swap costs 
given the new permutation and go to (b). 

(e) End after / iterations. 

The number of iterations in which a swap is per- 
formed divided by the total number of iterations is 
known as the acceptance rate a{I). 

The computational complexity of our approach 
has the following contributions. 



(i) Retrieving the value of A(p, r, s) is of complex- 
ity 0(1). 

(ii) Assuming an acceptance rate, a(/), the matrix 
A(p, r, 5) must be updated a{I)I times. The 
complexity of updating A{p^u^v) is 0{N'^) as 
described in [Appendix A[ 

Thus the complexity of our approach is 

0(a(/)/7V2) + 0(/) 

and we can write the CPU time of our approach 
as 

tA = cia{I)IN^ + C2I 

where ci and C2 are constants. The performance 
improvement factor of our approach relative to the 
traditional approach is then 



coN 



Co 



tfrad ^ 

tA ~ cia{I)N^ + C2 ^ cia{I)N 



(2) 



for large N. 

From Eq. (j2j) we see that the performance im- 
provement is critically dependent on the acceptance 
rate a{I). The constants cq and ci are indepen- 
dent of the problem instance, depending only on 
the SA implementation; for our implementation 
co/ci ~ 1/3. For our approach to be beneficial, F 
must have a value > 1 and thus a{I) must satisfy 



«(^) ^ 1/3A^. 



(3) 



It is useful to consider an instantaneous acceptance 
rate^ a{i)^ defined as the number of swaps performed 
in a window of iterations around iteration i divided 
by the size of the window. Our numerical exper- 
iments indicate that d{i) decreases with increas- 
ing i as the SA run progresses (see Appendix B). 
For this reason, our program uses the traditional 
approach until an iteration at which a{i) satisfies 
Eq. (j3j) and then begins using the more efficient 
A-matrix approach. 

For tabu search, a swap is made only after all 
possible N{N — l)/2 swaps are analyzed. Thus for 
tabu search, the acceptance rate, a ~ and 
the performance improvement is of 0(N). However, 
for simulated annealing a is a function of the QAP 
instance and the number of iterations performed 
in a run. We know of no method of determining 
the acceptance rate a priori. Hence, in section l4m 
we measure a{I) for different QAP instances and 
various values of / to get a sense of the performance 
improvement we can expect. 
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4. Numerical Results 

We perform numerical experiments on two classes 
of QAP instances: random instances and structured 
instances. The random instances are created in 
the same manner as the Taixxa instances (Taillard, 
1991) in QAPLIB (Burkard et al. 1997); the A 
and B matrices are symmetric with zero diagonal; 
the matrix elements are chosen from independent 
uniform distributions. The structured instances are 
the Taixxxe instances created by Taillard ( Drezner 
et al., 2005) designed to model real- world appli- 
cations which are difficult to treat using iterative 
heuristics such as SA; the instances are composed 
of matrices which are not random They are avail- 
able at jhttp: / /mistic.heig-vd| .ch/taillard/. 

All runs were performed on systems with Intel 
Xeon 2.4 GHz processors. 

4.1. Acceptance Rate 

To get an idea of the performance improvement 
we may expect, we measure the acceptance rate for 
a range of problem sizes and number of iterations 
for the problem instances discussed above. We use 
the C++ implementation written by Taillard that 
implements the SA heuristic of Connolly (1990). 
Taillard's code is available at ^http: / / mistic.heig-vd| 
.ch/taillard/ . 

4-2. Random instances 

In Fig. [IJa), we plot the acceptance rates a{I) 
for the random instances versus the number of iter- 
ations performed for various problem sizes. We note 
that a{I) decreases with increasing / until reaching 
a minimum after which a{I) increases slowly. The 
location of the minimum in a(/) is dependent on N] 
the larger the problem size, the higher the number 
of iterations at which the minimum is reached. In 
[Appendix B.l[ we provide insights into this behav- 
ior. 

We also note that for large numbers of iterations, 
a{I) is lower for larger problem sizes. This is fortu- 
itous because from Eq. (j2j) a lower value of a{I) is 
needed to achieve the same performance improve- 
ment factor if N is increased. 

4-3. Structured instances 

In Fig. [TJb), we plot the acceptance rate for the 
structured instances. The overall behavior is the 
same as for the random instances, but after the min- 
imum is reached, a{I) does not increase with /. In 



[Appendix B.2 , we provide explain this behavior. 
Also for instances of approximately the same size, 
a{I) is higher for the structured instances than for 
the random instances. This implies that the per- 
formance improvement for the structured instances 
will not be as great as for the random instances. 

4.4- Performance Improvement 

We use the Taillard code as a base for our imple- 
mentation described in Section [3l For the random 
instances, in Fig. [2] we plot CPU time vs the num- 
ber of iterations for the traditional approach and for 
our A-matrix approach. Note that the times for the 
traditional approach are linear with the number of 
iterations, consistent with Eq. ([T]). The A-matrix 
approach results are coincident with those of the 
traditional approach until Eq. (|3]) is satisfied, at 
which point it is more efficient to use the A ma- 
trix. From this point on, the A-matrix approach 
CPU times are lower than those of the traditional 
approach. Similar results hold for the structured 
instances as shown in Fig. [3l 

In Fig. m we plot the measured performance 
improvement ttrad/tA for random and structured 
instances. There is no performance improvement 
until the number of iterations ~ 10^. This reflects 
the fact that below this number of iterations Eq. 
(|3|) does not hold and the traditional method is 
used. We see that the relative performance of the 
A-matrix approach improves with increasing prob- 
lem size, from which we can infer from Eq. (|2]) 
that the acceptance rate decreases faster than the 
problem size as the problem size increases. 

5. Discussion and Summary 

We develop an implementation of the simulated 
annealing heuristic for the quadratic assignment 
problem which runs significantly faster than the tra- 
ditional implementation; the simulated annealing 
concept and algorithm for the QAP are unchanged. 
Our approach is motivated by the work of Taillard 
(1991) who took a similar approach in implement- 
ing tabu search for the quadratic assignment prob- 
lem. That the technique has not been applied to SA 
until now may be due to the fact that it does not 
provide significant benefit unless a large number of 
iterations are applied to a large QAP instance; this 
is now possible with the improvement of computing 
capabilities over the last 20 years. 

Being able to perform SA with a large number 
of iterations on large QAP instances is important 
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Figure 1: Acceptance rates for (a) random and (b) structured 
instances versus number of iterations. 



because (a) many real world problems are signifi- 
cantly larger than the size of problems to which SA 
has been traditionally applied and (b) the quality 
of the SA solution improves as the number of itera- 
tions is increased (Paul, 2010). Also, in Paul (2010) 
it was shown that for high quality solutions, the per- 
formance of SA is superior to that of tabu search. 
The current work makes this finding stronger be- 
cause of the improved performance of SA. 
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Figure 2: CPU time versus iterations for random instances 
for traditional implementation (circles) and efficient imple- 
mentation (squares). 
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Figure 3: CPU time versus iterations for structured instances for traditional implementation (circles) and efficient implemen- 
tation (squares). 
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Figure 4: Performance improvement for (a) random in- 
stances and (b) structured instances. 

Appendix A. Complexity of A-Matrix Up- 
date 

Following Taillard (1991), starting from an as- 
signment of facilities p let the resulting assignment 
after swapping facilities r and s be p^ That is: 

= k r^s 

p\r) = p{s) 
p\s) = p{r). 

For a symmetrical matrix with a null- diagonal, the 
cost A(j9, r, 5) of swapping r and 5 is: 

N N 

i=i j^i 

(A.l) 



To calculate A{p' ^u^v) for any u and v with com- 
plexity 0(N) , we can use equation ()A.ip . For asym- 
metric matrices or matrices with non-null diago- 
nals, a slightly more complicated version of Equa- 
tion (jA.ip also of complexity 0(N) is given by 
Burkhard and Rendl (1984). 

In the case that u and v are different from r or 5, 
we can calculate A{p^^u,v) with complexity 0(1) 
with 

A{p\ u, v) = A(p, u, v) + 2{Ar,u - Ar^v + As^y - As^u) 

using the value A{p,u,v) calculated in the previ- 
ous iteration. Since there are 0{N'^) matrix entries 
with u and v different from r or 5, these contribute 
complexity 0{N'^) to the A matrix update. There 
are 0{N) entries with u and v not different from r 
or s each of which requires the 0{N) calculation of 
Eq. (|A.ip . These entries thus also contribute com- 
plexity 0{N'^) and the overall complexity of updat- 
ing A is 0{N'^). 

Appendix B. Acceptance rate behavior 

Because the performance improvement depends 
critically on the acceptance rate as shown in Eq. 
dHJ , here we explore the behavior of the acceptance 
rate as shown in Fig. [H In order to do this, we first 
provide more details of the cooling schedule of the 
SA algorithm (Connolly, 1990) and describe how an 
SA run progresses. 

The details of the cooling schedule are as follows: 
At the start of each heuristic run, the temperature 
is set to a high value Tq. After each iteration, the 
temperature is reduced using the recursive relation 

(B.l) 



1.0 + /3T 



where 
To 



/3 



IToTf 



(B.2) 



to reach a desired final temperature Tf when all it- 
erations are exhausted. Note from Eq. (|B.2p that 
the temperature decreases in smaller steps for larger 
/. The temperature decrease may end before Tf 
is reached if a specified number, Q, of consecutive 
iterations in which no swap is performed is encoun- 
tered. We denote the temperature at which this 
occurs as Tq. 
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At the beginning of an SA run, the initial QAP 
configuration is random and there are relatively 
many moves which will result in a cost decrease. 
As the run progresses, the number of moves which 
will result in a lower cost decreases. Also as the 
run progresses, the decreasing temperature results 
in a lower frequency of swaps with a positive incre- 
mental cost 6. Thus the instantaneous acceptance 
rate a{i) (defined in Section [3|) decreases as the run 
progresses. Eventually if the number of iterations 
/ specified for the run is large enough, the num- 
ber of consecutive iterations in which a swap is not 
performed will exceed the specified limit Q and the 
temperature and a{i) will no longer decrease. 

With this background we now discuss the behav- 
ior of the acceptance rates a{I) shown in Fig. [T]for 
both the random and structured instances. 



Appendix B.l. Random instances 

To study the acceptance rate a{I) at a more gran- 
ular level, for various values of /, Fig. IB.ll plots the 
instantaneous acceptance rate a{i) versus the itera- 
tion number i for = 128; plots for other values of 
N are similar. For / > 10^, the plots for each value 
of / eventually reach constant values ax{I). The 
plots become constant at the iteration at which no 
swaps have been performed in the previous Q itera- 
tions at which point the temperature is fixed at Tq . 
We also note that the constant values ax (!) increase 
with increasing /. This behavior can be understood 
by plotting Tq. For the random N = 128 instance, 
in Fig. IB. 21 we plot Tq as function of / and note 
that Tq increases with increasing /. This refiects 
the fact that because of the the slower cooling for 
larger number of iterations, SA finds a deeper lo- 
cal minima from which it cannot escape easily at 
a higher temperatures for larger /. The increase in 
ax{I) as / increases explains the increase in a{I) for 
larger values of / in Fig. [1] (a). 

For / < 10^, the plots in Fig. IB. II terminate 
at lower and lower values as / increases. This ex- 
plains the initial decrease in the acceptance rate as 
a function of / in Fig. [U^a). The plots do not reach 
a constant value and terminate at a higher value 
than plots with / > 10^. This occurs because de- 
spite the low temperatures at the end of run which 
inhibit cost-raising swaps, there are sufficient num- 
bers of cost-reducing swaps to maintain a relatively 
high value of a{i). 



Appendix B.2. Structured instances 

As with the random instances, in order to un- 
derstand the behavior in Fig. [Hb) we must study 
the instantaneous acceptance rate a{i). For vari- 
ous values of /, Fig. IB. 31 plots a{i) versus the i. 
As opposed to the random instances, all plots de- 
crease monotonically and never level off to a con- 
stant value. This behavior is explained by the fact 
that for the structured instances studied, the limit 
Q on the number of consecutive iterations in which 
no swap is made is never reached; there are always 
low cost swaps which can be made. The tempera- 
ture always decreases to Tf and as the temperature 
decreases a{i) decreases. Because the plots in Fig. 
IB.3f b) for larger / all scale with the number of iter- 
ations, the value reached by the plots in Fig. [TJb) 
remain constant for large /. 

Similar to the behavior for the random instances 
and for the same reason, for / < 10^, the a{i) plots 
terminate at a higher value than plots with / > 10^. 
As with the random instances, this explains the ini- 
tial decrease in the acceptance rate as a function of 
/ in Fig. m b). 

Appendix C. Supplementary material 

The C++ program which implements the SA 
heuristic described in this article can be found, in 
the online version, at doiixxxxxx. 
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